Nparse - a shallow n-gram-based grammatical-phrase parser

نویسنده

  • Alice Carlberger
چکیده

Nparse is a shallow probabilistic unification-based parser for N-best list resorting and the finding of simple grammatical phrases. It is data-driven and robust, allowing both domainspecific and unrestricted-language training. We believe it can be an interesting alternative for use in a synthesis or recognition front end. This parser has been trained for Swedish on a fine-grained set of grammatical-phrase nodes and grammatical features and evaluated on three language domains. A tree bank database has been built and a detailed linguistic assessment performed. Later, these results will be compared with evaluation on a simplified node-and-feature system. Our aim is to find the optimal system complexity for accurately establishing phrase boundaries and phrase types in newspaper text and, ultimately, unrestricted language. For this, a combination of iterative manual training and unsupervised training will be used.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Comparative Evaluation of Deep and Shallow Approaches to the Automatic Detection of Common Grammatical Errors

This paper compares a deep and a shallow processing approach to the problem of classifying a sentence as grammatically wellformed or ill-formed. The deep processing approach uses the XLE LFG parser and English grammar: two versions are presented, one which uses the XLE directly to perform the classification, and another one which uses a decision tree trained on features consisting of the XLE’s ...

متن کامل

Feature Engineering in Persian Dependency Parser

Dependency parser is one of the most important fundamental tools in the natural language processing, which extracts structure of sentences and determines the relations between words based on the dependency grammar. The dependency parser is proper for free order languages, such as Persian. In this paper, data-driven dependency parser has been developed with the help of phrase-structure parser fo...

متن کامل

Shallow-Syntax Phrase-Based Translation: Joint versus Factored String-to-Chunk Models

This work extends phrase-based statistical MT (SMT) with shallow syntax dependencies. Two string-to-chunks translation models are proposed: a factored model, which augments phrase-based SMT with layered dependencies, and a joint model, that extends the phrase translation table with microtags, i.e. perword projections of chunk labels. Both rely on n-gram models of target sequences with different...

متن کامل

Towards Framework-Independent Evaluation of Deep Linguistic Parsers

This paper describes practical issues in the framework-independent evaluation of deep and shallow parsers. We focus on the use of two dependencybased syntactic representation formats in parser evaluation, namely, Carroll et al. (1998)’s Grammatical Relations and de Marneffe et al. (2006)’s Stanford Dependency scheme. Our approach is to convert the output of parsers into these two formats, and m...

متن کامل

N-gram language modeling of Japanese using bunsetsu boundaries

A new scheme of N-gram language modeling was proposed for Japanese, where word N-grams were calculated separately for the two cases: crossing and not crossing bunsetsu boundaries. Here, bunsetsu is a basic grammatical (and pronunciation) unit of Japanese. A similar scheme using accent phrase boundaries instead of bunsetsu boundaries has already been proposed by the authors with a certain succes...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1999